Digital Audio Effects Applied Directly on a DSD Bitstream
Digital audio effects are typically implemented on 16- or 24-bit signals sampled at 44.1 kHz. Yet high-quality audio is often encoded in a one-bit, highly oversampled format, such as DSD. Processing a bitstream, and applying audio effects to it, requires special care and modification of existing methods. However, it has strong advantages due to the high-quality phase information and the elimination of multiple decimators and interpolators in the recording and playback process. We present several methods by which audio effects can be applied directly on a bitstream. We also discuss the modifications that need to be made to existing methods for them to be properly applied to DSD audio. Methods are presented through the use of block diagrams, and results are reported.
Keywords: Sigma Delta Modulation, SACD, DSD, Digital Audio Effects, Bitstream Signal Processing
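As a rough illustration of why bitstream processing needs special care, the sketch below uses a textbook first-order sigma-delta modulator, which is an assumption and not the modulator or the methods of the paper. Scaling a one-bit stream by a gain produces values that are no longer one-bit, so the result has to be requantized. The function names `sigma_delta_1bit` and `apply_gain_to_bitstream` are hypothetical.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: samples in [-1, 1] -> list of +1/-1."""
    integrator = 0.0
    previous_bit = 0.0  # feedback of the last output bit (none before the start)
    bits = []
    for x in samples:
        # Accumulate the error between the input and the fed-back output bit.
        integrator += x - previous_bit
        previous_bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(int(previous_bit))
    return bits


def apply_gain_to_bitstream(bits, gain):
    """Scale a bitstream by `gain` (|gain| <= 1) and return a new one-bit stream."""
    scaled = [gain * b for b in bits]  # multi-valued, no longer one-bit
    return sigma_delta_1bit(scaled)    # requantize back to one bit


def mean(values):
    return sum(values) / len(values)


# A DC input of 0.5 is encoded as a bit pattern whose average is close to 0.5.
dc_bits = sigma_delta_1bit([0.5] * 1000)
quieter_bits = apply_gain_to_bitstream(dc_bits, 0.5)
print(mean(dc_bits), mean(quieter_bits))
```

The averages act as a crude low-pass decoder here: the first lands near 0.5 and the second near 0.25, while both streams still contain only +1 and -1.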
Automatic Mixing: Live Downmixing Stereo Panner
An automatic stereo panning algorithm intended for live multitrack downmixing has been researched. The algorithm uses spectral analysis to determine the panning position of sources. The method uses filter bank quantitative channel dependence, priority channel architecture and constrained rules to assign panning criteria. The algorithm attempts to minimize spectral masking by allocating similar spectra to different panning spaces. The algorithm has been implemented; results on its convergence, automatic panning space allocation, and left-right inter-channel phase relationship are presented.
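The abstract does not state which pan law is used once a panning position has been chosen. As a minimal sketch, assuming the common constant-power (sine/cosine) law, a position can be turned into left and right gains and applied in a stereo downmix as follows. The names `constant_power_pan` and `downmix` are hypothetical, and the spectral analysis that selects the positions is not modelled.

```python
import math


def constant_power_pan(position):
    """Map position in [-1 (left), +1 (right)] to (left_gain, right_gain)."""
    angle = (position + 1.0) * math.pi / 4.0  # 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def downmix(tracks, positions):
    """Mix mono tracks (equal-length lists) to stereo using the pan positions."""
    length = len(tracks[0])
    left = [0.0] * length
    right = [0.0] * length
    for track, position in zip(tracks, positions):
        left_gain, right_gain = constant_power_pan(position)
        for n, sample in enumerate(track):
            left[n] += left_gain * sample
            right[n] += right_gain * sample
    return left, right


# Two tracks panned hard left and hard right stay in separate channels.
left, right = downmix([[1.0, 1.0], [0.5, 0.5]], [-1.0, 1.0])
```

With this law the squared gains always sum to one, so a source keeps the same total power wherever it is placed.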
Automatic Target Mixing using Least-Squares Optimization of Gains and Equalization Settings
The proposed automatic target mixing algorithm determines the gains and the equalization settings for the mixing of a multi-track recording using a least-squares optimization. These parameters are estimated using a single-channel target mix, that is, a signal that contains the same audio tracks as the multi-track recording but has previously been mixed using some unknown settings. Several tests have been carried out to evaluate the performance of two different approaches to the optimization, namely the sub-band estimator and the FIR filter estimator. The results show that, using the latter technique, the proposed algorithm is able to retrieve the parameters originally applied to the target mix. This can be useful for remastering applications, where both the original recording sessions and the final mix are available, but the mixing parameters originally applied to the various audio tracks need to be retrieved.
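The gain part of this estimation can be sketched as an ordinary least-squares problem. The example below is a simplification under stated assumptions: it recovers one broadband gain per track by solving the normal equations, and it leaves out equalization as well as the sub-band and FIR estimators evaluated in the paper. The helper names `solve_linear` and `estimate_gains` are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_gains(tracks, target):
    """Least-squares gains g minimizing ||target - sum_k g[k] * tracks[k]||^2."""
    k = len(tracks)
    gram = [[sum(p * q for p, q in zip(tracks[i], tracks[j])) for j in range(k)]
            for i in range(k)]
    cross = [sum(p * q for p, q in zip(tracks[i], target)) for i in range(k)]
    return solve_linear(gram, cross)


# Two synthetic tracks mixed with "unknown" gains 0.5 and 2.0.
tracks = [[math.sin(0.10 * n) for n in range(200)],
          [math.sin(0.37 * n) for n in range(200)]]
target = [0.5 * a + 2.0 * b for a, b in zip(*tracks)]
gains = estimate_gains(tracks, target)
```

Because the synthetic target is an exact linear combination of the tracks, the recovered gains match the ones used to build it up to rounding error.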
Automatic Noise Gate Settings for Multitrack Drum Recordings
A method has been developed for automating the settings of a noise gate. The method has been applied to a kick drum track containing bleed from secondary drum sources and white noise. The optimal settings are found by maximising the signal-to-distortion ratio (SDR). The SDR has contributions from the distortion caused to the kick drum signal, and from the residual bleed and noise. These two components are weighted, enabling the gate to be controlled by a single parameter. It is shown that an improvement in the SDR can be obtained when the two components of the SDR are approximated, enabling the optimal settings to be calculated from the noisy signal and a single kick drum hit. It is found that the optimal threshold is slightly above the peak level of the noise component of the signal.
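The idea of choosing a gate setting by maximising the SDR can be illustrated with a toy grid search. This is a sketch under strong assumptions: the gate is a sample-wise hard gate with no attack, release or hold, the signals are synthetic, and the SDR is unweighted, so it does not reproduce the paper's weighted two-component measure. All names are hypothetical.

```python
import math


def hard_gate(signal, threshold):
    """Pass samples whose magnitude exceeds the threshold, mute the rest."""
    return [x if abs(x) > threshold else 0.0 for x in signal]


def sdr_db(clean, processed):
    """Signal-to-distortion ratio of `processed` against the clean reference."""
    signal_power = sum(c * c for c in clean)
    error_power = sum((p - c) ** 2 for p, c in zip(processed, clean))
    if error_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / error_power)


def best_threshold(clean, noisy, candidates):
    """Return the first candidate threshold that maximises the SDR."""
    return max(candidates, key=lambda t: sdr_db(clean, hard_gate(noisy, t)))


# Toy data: full-scale bursts standing in for drum hits, plus noise peaking at 0.08.
clean = [(1.0 if n % 2 == 0 else -1.0) if n % 100 < 20 else 0.0
         for n in range(400)]
noise = [0.08 if n % 3 == 0 else -0.08 for n in range(400)]
noisy = [c + v for c, v in zip(clean, noise)]

candidates = [0.05 * i for i in range(30)]
threshold = best_threshold(clean, noisy, candidates)
print(threshold, sdr_db(clean, hard_gate(noisy, threshold)))
```

On this toy data the selected threshold is the first candidate above the noise peak, which is in line with the finding reported in the abstract, although a real gate would act on a smoothed envelope rather than on individual samples.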
Microphone Interference Reduction in Live Sound
When multiple microphones are used to reproduce multiple sources, microphone interference, or bleed, can occur because each microphone picks up more than one source. This paper proposes combining the crosstalk resistant adaptive noise canceller (CTRANC) algorithm with centred adaptive filters, using an estimation of delay, to suppress the interference while making little change to the target signal. The proposed method is compared with similar methods in both the anechoic and echoic cases. The method is shown to outperform the other methods in the anechoic case, while in the echoic case it performs less well at reducing the level of the interference but still introduces the fewest artefacts. Extension of the proposed method to the case of N sources and N microphones is also discussed.
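The abstract mentions an estimation of delay but does not say how it is obtained. One common approach, assumed here purely for illustration, is to pick the lag that maximises the cross-correlation between two microphone signals. The function name `estimate_delay` and the synthetic signals are hypothetical, and the CTRANC stage itself is not shown.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag in [0, max_lag] that maximises the cross-correlation."""
    def correlation(lag):
        # zip() stops at the shorter sequence, so no explicit bounds are needed.
        return sum(r * d for r, d in zip(reference, delayed[lag:]))
    return max(range(max_lag + 1), key=correlation)


# A decaying burst reaches microphone B five samples later and at half the level.
burst = [0.9 ** n for n in range(100)]
mic_a = burst + [0.0] * 20
mic_b = [0.0] * 5 + [0.5 * s for s in burst] + [0.0] * 15

delay = estimate_delay(mic_a, mic_b, max_lag=20)
print(delay)
```

An estimate of this kind could be used to position an adaptive filter so that its main tap sits near the true inter-microphone delay.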
Variable Source Radiation Pattern Synthesis for use in Two-Dimensional Sound Reproduction
In this paper the authors present an approach for two-dimensional sound reproduction using a circular layout of speakers, where the gains are obtained from a variable polar pattern. The method presented here can be variable-order whilst keeping the same key features of a base polar pattern. Comparisons are drawn between the new approach and a previous approach by the authors using variable-order, variable-decoder Ambisonics. The new method is found not to be as directional as the Ambisonics approach, yet, unlike Ambisonics, it maintains the base polar pattern. Whilst both approaches have two variable parameters, the new approach’s parameters are independent and are therefore intuitive to an end user using such a tool as a spatialisation effect as well as technique.
Simulating Microphone Bleed and Tom-tom Resonance in Multisampled Drum Workstations
In recent years multisampled drum workstations have become increasingly popular. They offer an alternative to recording a full drum kit if a producer, engineer or amateur lacks the equipment, money, space or knowledge to produce a quality recording. These drum workstations strive for realism, often recording up to a hundred different velocity hits of the same drum, including recordings from all microphones for each drum hit and the bleed between these microphones. This paper describes research undertaken to investigate whether it is possible to simulate the snare and kick drum bleed into the tom-tom microphones, and the resulting resonance of the tom-tom, with the aim of reducing the amount of audio data that needs to be stored. A listening test was performed in which participants were asked to identify the real recording from a simulation. The results were not statistically significant, so the hypothesis that subjects were unable to distinguish between the real and simulated recordings could not be rejected. This suggests listeners were unable to identify the real recording in the majority of cases.
An Autonomous Method for Multi-Track Dynamic Range Compression
Dynamic range compression is a nonlinear audio effect that reduces the dynamic range of a signal and is frequently used as part of the process of mixing multi-track audio recordings. A system for automatically setting the parameters of multiple dynamic range compressors (one acting on each track of the multi-track mix) is described. The perceptual signal features loudness and loudness range are used to cross-adaptively control each compressor. The system is fully autonomous and includes six different modes of operation. These were compared and evaluated against a mix in which compressor settings were chosen by an expert audio mix engineer. Clear preferences were established for the different modes of operation, and it was found that the autonomous system was capable of producing audio mixes of approximately the same subjective quality as those produced by the expert engineer.
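For readers unfamiliar with the effect being automated, the static behaviour of a single compressor can be sketched as a hard-knee gain computer working in decibels. This is the textbook curve, given here as an assumption; it does not include attack and release smoothing, and it is not the cross-adaptive control based on loudness and loudness range that the paper describes. The function name `compressor_gain_db` is hypothetical.

```python
def compressor_gain_db(level_db, threshold_db, ratio):
    """Gain in dB that a hard-knee compressor applies at a given input level."""
    if level_db <= threshold_db:
        return 0.0  # below the threshold the signal is left untouched
    # Above the threshold, the overshoot is divided by the ratio.
    return (threshold_db - level_db) * (1.0 - 1.0 / ratio)


# With a -20 dB threshold and a 4:1 ratio, an input at -10 dB overshoots by
# 10 dB and is allowed only 2.5 dB of overshoot, so it is attenuated by 7.5 dB.
for level in (-30.0, -20.0, -10.0, 0.0):
    gain = compressor_gain_db(level, threshold_db=-20.0, ratio=4.0)
    print(level, gain, level + gain)
```

An automatic system of the kind described would choose values such as the threshold and ratio for each track rather than leaving them to the engineer.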
Real-time excitation based binaural loudness meters
The measurement of perceived loudness is a difficult yet important task with a multitude of applications such as loudness alignment of complex stimuli and loudness restoration for the hearing impaired. Although computational hearing models exist, few are able to accurately predict the binaural loudness of everyday sounds. Such models demand excessive processing power making real-time loudness metering problematic. In this work, the dynamic auditory loudness models of Glasberg and Moore (J. Audio Eng. Soc., 2002) and Chen and Hu (IEEE ICASSP, 2012) are presented, extended and realised as binaural loudness meters. The performance bottlenecks are identified and alleviated by reducing the complexity of the excitation transformation stages. The effects of three parameters (hop size, spectral compression and filter spacing) on model predictions are analysed and discussed within the context of features used by scientists and engineers to quantify and monitor the perceived loudness of music and speech. Parameter values are presented and perceptual implications are described.